
    Get Rid Of Your Trail: Remotely Erasing Backdoors in Federated Learning

    Federated Learning (FL) enables collaborative deep learning training across multiple participants without exposing sensitive personal data. However, the distributed nature of FL and the unvetted data of its participants make it vulnerable to backdoor attacks. In these attacks, adversaries inject malicious functionality into the centralized model during training, leading to intentional misclassifications for specific adversary-chosen inputs. While previous research has demonstrated successful injections of persistent backdoors in FL, this persistence also poses a challenge: the backdoors' existence in the centralized model can prompt the central aggregation server to take preventive measures and penalize the adversaries. Therefore, this paper proposes a methodology that enables adversaries to effectively remove backdoors from the centralized model upon achieving their objectives or upon suspicion of possible detection. The proposed approach extends the concept of machine unlearning and presents strategies that preserve the performance of the centralized model while preventing over-unlearning of information unrelated to the backdoor patterns, keeping the adversaries stealthy as they remove the backdoors. To the best of our knowledge, this is the first work that explores machine unlearning in FL to remove backdoors to the benefit of adversaries. An exhaustive evaluation on image classification scenarios demonstrates the efficacy of the proposed method in efficiently removing backdoors, injected by state-of-the-art attacks, from the centralized model across multiple configurations.
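
    The abstract does not spell out the unlearning objective, but a minimal sketch conveys the general shape such an adversarial update could take. The PyTorch snippet below is purely illustrative: the gradient ascent on trigger-stamped samples, the clean-data loss, and the L2 drift penalty against over-unlearning are assumptions, not the paper's method, and `model`, `backdoor_batch`, and `clean_batch` are hypothetical.

```python
# Illustrative sketch only: an adversarial client "unlearns" its own backdoor
# by ascending the backdoor loss while two extra terms guard against
# over-unlearning. All names are hypothetical, not the paper's.
import torch
import torch.nn.functional as F

def unlearning_update(model, backdoor_batch, clean_batch,
                      lr=0.01, drift_penalty=1.0, steps=10):
    anchor = [p.detach().clone() for p in model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    bx, by = backdoor_batch   # trigger-stamped inputs, adversary-chosen labels
    cx, cy = clean_batch      # benign samples held by the adversary
    for _ in range(steps):
        opt.zero_grad()
        loss_bd = -F.cross_entropy(model(bx), by)       # undo the backdoor mapping
        loss_clean = F.cross_entropy(model(cx), cy)     # keep clean accuracy
        drift = sum((p - a).pow(2).sum()                # stay near the global model
                    for p, a in zip(model.parameters(), anchor))
        (loss_bd + loss_clean + drift_penalty * drift).backward()
        opt.step()
    return model  # submitted as the adversary's next FL update
```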

    HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis

    As the use of Large Language Models (LLMs) in text generation tasks proliferates, concerns arise over their potential to compromise academic integrity. The education sector currently struggles to distinguish student-authored homework assignments from AI-generated ones. This paper addresses the challenge by introducing HowkGPT, a tool designed to identify homework assignments generated by AI. HowkGPT is built upon a dataset of academic assignments and accompanying metadata [17] and employs a pretrained LLM to compute perplexity scores for student-authored and ChatGPT-generated responses. These scores then assist in establishing a threshold for discerning the origin of a submitted assignment. Given the specificity and contextual nature of academic work, HowkGPT further refines its analysis by defining category-specific thresholds derived from the metadata, enhancing the precision of the detection. This study emphasizes the critical need for effective strategies to uphold academic integrity amidst the growing influence of LLMs and provides an approach to ensuring fair and accurate grading in educational institutions.
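
    As a rough illustration of perplexity-based detection (not HowkGPT's actual model, dataset, or thresholds), the sketch below scores a text with GPT-2 via the `transformers` library and compares it against an invented category-specific threshold; lower perplexity than the threshold is taken as a sign of LLM-generated text.

```python
# Illustrative only: GPT-2 stands in for HowkGPT's pretrained LLM, and the
# category thresholds below are invented, not derived from its dataset.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss      # mean next-token cross-entropy
    return torch.exp(loss).item()

THRESHOLDS = {"essay": 35.0, "lab_report": 28.0}  # hypothetical values

def flagged_as_ai(text: str, category: str) -> bool:
    # LLM-generated text tends to score *lower* perplexity under an LLM.
    return perplexity(text) < THRESHOLDS[category]
```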

    Cryptoleq: A Heterogeneous Abstract Machine for Encrypted and Unencrypted Computation

    The rapid expansion and increased popularity of cloud computing come with no shortage of privacy concerns about outsourcing computation to semi-trusted parties. Leveraging the power of encryption, in this paper we introduce Cryptoleq: an abstract machine based on the concept of a One Instruction Set Computer, capable of performing general-purpose computation on encrypted programs. The program operands are protected using the Paillier partially homomorphic cryptosystem, which supports addition in the encrypted domain. Full homomorphism over addition and multiplication, which is necessary for enabling general-purpose computation, is achieved by inventing a heuristically obfuscated software re-encryption module written using Cryptoleq instructions and blended into the executing program. Cryptoleq is heterogeneous, allowing encrypted and unencrypted instruction operands to be mixed in the same program memory space. Programming with Cryptoleq is facilitated by an enhanced assembly language that allows the development of any advanced algorithm on encrypted datasets. In our evaluation, we compare Cryptoleq's performance against a popular fully homomorphic encryption library, and demonstrate correctness using a typical Private Information Retrieval problem.
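
    The additive homomorphism Cryptoleq relies on is easy to demonstrate: in Paillier, multiplying two ciphertexts modulo n^2 yields an encryption of the sum of the plaintexts. Below is a textbook toy with tiny, insecure parameters, unrelated to Cryptoleq's actual implementation.

```python
# Textbook Paillier with toy parameters; demo of E(a) * E(b) = E(a + b).
import random
from math import gcd

p, q = 61, 53                                # tiny primes, demo only
n = p * q                                    # 3233
n2, g = n * n, n + 1                         # common choice g = n + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1) # lcm(p - 1, q - 1)

def enc(m: int) -> int:
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c: int) -> int:
    L = lambda u: (u - 1) // n
    mu = pow(L(pow(g, lam, n2)), -1, n)      # precomputable in practice
    return (L(pow(c, lam, n2)) * mu) % n

c = (enc(20) * enc(22)) % n2                 # multiply ciphertexts...
assert dec(c) == 42                          # ...to add the plaintexts
```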

    TERMinator Suite: Benchmarking Privacy-Preserving Architectures

    Security and privacy are fundamental objectives characterizing contemporary cloud computing. Despite the wide adoption of encryption for protecting data in transit and at rest, data in use remains unencrypted inside cloud processors and memories, as computation is not applicable on encrypted values. This limitation introduces security risks, as unencrypted values can be leaked through side-channels or hardware Trojans. To address this problem, encrypted architectures have recently been proposed, which leverage homomorphic encryption to natively process encrypted data using datapaths of thousands of bits. In this case, additional security protections are traded for higher performance penalties, which drives the need for more efficient architectures. In this work, we develop benchmarks specifically tailored to encrypted computers, to enable comparisons across different architectures. Our benchmark suite, dubbed TERMinator, is unique as it avoids 'termination problems' that prohibit making control-flow decisions and evaluating early termination conditions based on encrypted data, as these can leak information. Contrary to generic suites that ignore the fundamental challenges of encrypted computation, our algorithms are tailored to the security primitives of the target encrypted architecture, such as the existence of branching oracles. In our experiments, we compiled our benchmarks for the Cryptoleq architecture and evaluated their performance for a range of security parameters.
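
    To make the 'termination problem' concrete: any loop exit or branch that depends on an encrypted value leaks that value through the program's control flow, so code in this setting must run in data-independent time. The sketch below shows that data-oblivious style with plain integers standing in for ciphertexts; it illustrates the principle and is not one of the TERMinator benchmarks.

```python
# Data-oblivious style: fixed-length loops and arithmetic selects instead of
# branches and early exits. Plain ints stand in for ciphertexts.
def select(bit: int, a: int, b: int) -> int:
    # Branch-free multiplexer: a if bit == 1 else b.
    return bit * a + (1 - bit) * b

def oblivious_find(xs: list[int], target: int) -> int:
    # Always scans the whole list, so running time and control flow
    # never reveal where (or whether) the match occurred.
    found = -1
    for i, x in enumerate(xs):
        hit = int(x == target)         # a homomorphic comparison in the real setting
        first_hit = hit * int(found == -1)
        found = select(first_hit, i, found)
    return found

assert oblivious_find([5, 3, 9, 3], 3) == 1
```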

    Privacy-preserving machine learning for healthcare: open challenges and future perspectives

    Machine Learning (ML) has recently shown tremendous success in modeling various healthcare prediction tasks, ranging from disease diagnosis and prognosis to patient treatment. Due to the sensitive nature of medical data, privacy must be considered along the entire ML pipeline, from model training to inference. In this paper, we conduct a review of recent literature concerning Privacy-Preserving Machine Learning (PPML) for healthcare. We primarily focus on privacy-preserving training and inference-as-a-service, and perform a comprehensive review of existing trends, identify challenges, and discuss opportunities for future research directions. The aim of this review is to guide the development of private and efficient ML models in healthcare, with the prospects of translating research efforts into real-world settings. Comment: ICLR 2023 Workshop on Trustworthy Machine Learning for Healthcare (TML4H).

    E3: A Framework for Compiling C++ Programs with Encrypted Operands

    In this technical report we describe E3 (Encrypt-Everything-Everywhere), a framework which enables execution of standard C++ code with homomorphically encrypted variables. The framework automatically generates protected types, so the programmer can remain oblivious to the underlying encryption scheme. The C++ protected classes redefine operators according to the encryption scheme, effectively making the introduction of a new API unnecessary. In its current version, E3 supports a variety of homomorphic encryption libraries, batching, mixing different encryption schemes in the same program, as well as the ability to combine modular computation and bit-level computation.
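
    A small sketch can mimic what E3's generated protected classes do with operator overloading, rendered here in Python rather than C++ for brevity. The `scheme` object, with `encrypt`/`decrypt`/`add`/`mul` methods, is hypothetical and stands in for whichever homomorphic library backs the protected type; this is not E3's actual API.

```python
# Python analogue of E3's protected types; `scheme` is a hypothetical object
# exposing encrypt/decrypt/add/mul, not E3's actual interface.
class Protected:
    def __init__(self, ct, scheme):
        self.ct, self.scheme = ct, scheme

    @classmethod
    def encrypt(cls, value, scheme):
        return cls(scheme.encrypt(value), scheme)

    def __add__(self, other):            # user code writes a + b ...
        return Protected(self.scheme.add(self.ct, other.ct), self.scheme)

    def __mul__(self, other):            # ... and a * b, oblivious to the scheme
        return Protected(self.scheme.mul(self.ct, other.ct), self.scheme)

    def decrypt(self):
        return self.scheme.decrypt(self.ct)

# a = Protected.encrypt(3, scheme); b = Protected.encrypt(4, scheme)
# (a + b * b).decrypt()  ->  19, computed entirely on ciphertexts
```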

    CoFHEE: A Co-processor for Fully Homomorphic Encryption Execution

    The migration of computation to the cloud has raised privacy concerns, as sensitive data becomes vulnerable to attacks since it needs to be decrypted for processing. Fully Homomorphic Encryption (FHE) mitigates this issue, as it enables meaningful computations to be performed directly on encrypted data. Nevertheless, FHE is orders of magnitude slower than unencrypted computation, which hinders its practicality and adoption. Therefore, improving FHE performance is essential for its real-world deployment. In this paper, we present a year-long effort to design, implement, fabricate, and post-silicon validate a hardware accelerator for Fully Homomorphic Encryption, dubbed CoFHEE. With a design area of 12 mm^2, CoFHEE aims to improve the performance of ciphertext multiplication, the most demanding arithmetic FHE operation, by accelerating several primitive operations on polynomials, such as polynomial addition and subtraction, the Hadamard product, and the Number Theoretic Transform. CoFHEE supports polynomial degrees of up to n = 2^14 with a maximum coefficient size of 128 bits, and can perform ciphertext multiplications entirely on chip for n <= 2^13. CoFHEE is fabricated in 55nm CMOS technology and achieves 250 MHz with our custom-built low-power digital PLL design. In addition, our chip includes two communication interfaces to the host machine: UART and SPI. This manuscript presents all steps and design techniques in the ASIC development process, ranging from RTL design to fabrication and validation. We evaluate our chip with performance and power experiments and compare it against state-of-the-art software implementations and other ASIC designs. The developed RTL files are available in an open-source repository.
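
    To give a sense of the core primitive, the toy below multiplies polynomials by cyclic convolution with a Number Theoretic Transform over Z_17 and n = 8. Real FHE parameters (n up to 2^14, 128-bit coefficients, negacyclic arithmetic mod x^n + 1) and CoFHEE's hardware pipeline are far beyond this sketch.

```python
# Toy NTT-based polynomial multiplication over Z_17 with n = 8 (n must divide
# q - 1). O(n^2) transforms for clarity; hardware uses O(n log n) butterflies.
q, n, w = 17, 8, 9            # w = 9 is a primitive 8th root of unity mod 17

def ntt(a, root):
    return [sum(a[j] * pow(root, i * j, q) for j in range(n)) % q
            for i in range(n)]

def poly_mul(a, b):
    # Cyclic convolution mod (x^n - 1, q): transform, multiply pointwise, invert.
    A, B = ntt(a, w), ntt(b, w)
    C = [(x * y) % q for x, y in zip(A, B)]
    inv_n = pow(n, -1, q)
    return [(c * inv_n) % q for c in ntt(C, pow(w, -1, q))]

# (1 + x)^2 = 1 + 2x + x^2
assert poly_mul([1, 1, 0, 0, 0, 0, 0, 0],
                [1, 1, 0, 0, 0, 0, 0, 0]) == [1, 2, 1, 0, 0, 0, 0, 0]
```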